Speech recognition in car noise environments using multiple models according to noise masking levels

Authors

  • Myung Gyu Song
  • Hoi In Jung
  • Kab-Jong Shim
  • Hyung Soon Kim
Abstract

In speech recognition for real-world applications, the performance degradation caused by the mismatch between training and testing environments must be overcome. In this paper, to reduce this mismatch, we propose a hybrid method that combines spectral subtraction with residual noise masking. We also employ a multiple-model approach to improve robustness across various noise environments. In this approach, multiple model sets are built for several noise masking levels, and the model set appropriate for the estimated noise level is selected automatically in the recognition phase. In speaker-independent isolated word recognition experiments in car noise environments, the proposed method, using model sets with only two masking levels, reduces the average word error rate by 60% compared with the spectral subtraction method.
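The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the subtraction rule (plain power subtraction clamped at zero), the masking rule (raising every bin to a floor), the selection rule (smallest masking level that covers the mean noise power), the two masking levels, and all names such as `MODEL_SETS` and `front_end` are assumptions made purely for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical masking levels (power units) mapped to the model set
# trained with that level. Both the values and the names are assumptions.
MODEL_SETS: Dict[float, str] = {1.0: "models_low_mask", 10.0: "models_high_mask"}


def spectral_subtract(noisy: Sequence[float], noise: Sequence[float]) -> List[float]:
    """Subtract the estimated noise power from each bin, clamping at zero."""
    return [max(y - n, 0.0) for y, n in zip(noisy, noise)]


def mask_residual(power: Sequence[float], level: float) -> List[float]:
    """Residual noise masking: raise every bin to at least the masking level."""
    return [max(p, level) for p in power]


def select_masking_level(noise: Sequence[float], levels: Sequence[float]) -> float:
    """Pick the smallest level that covers the mean noise power, else the largest."""
    mean_noise = sum(noise) / len(noise)
    for level in sorted(levels):
        if mean_noise <= level:
            return level
    return max(levels)


def front_end(noisy: Sequence[float], noise: Sequence[float]) -> Tuple[List[float], str]:
    """Return the processed power spectrum and the name of the matching model set."""
    level = select_masking_level(noise, list(MODEL_SETS))
    cleaned = mask_residual(spectral_subtract(noisy, noise), level)
    return cleaned, MODEL_SETS[level]


if __name__ == "__main__":
    # One frame of a toy 3-bin power spectrum with a quiet noise estimate.
    print(front_end([5.0, 0.5, 20.0], [0.5, 0.5, 0.5]))
```

Because the same masking level is applied in training and in recognition, bins dominated by residual noise look alike in both conditions, which is the mismatch reduction the abstract aims for. Selecting the level from the noise estimate is what ties each test utterance to one of the pre-trained model sets.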

Similar resources

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

In recent years, automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

Clean speech feature estimation based on soft spectral masking

In this paper, we first analyze the problems of the speech and noise contamination process from a noise-masking point of view, and propose a new approach to estimate the degree of the noise masking effect on the clean speech distribution model based on sequential noise estimation. Sequential noise estimation is performed frame by frame using the interacting multiple model (IMM) algorithm, so that realtime implementati...

A Unified Approach of Compensation and Soft Masking Incorporating a Statistical Model into the Wiener Filter

In this paper, we present a new single-channel noise reduction method that integrates compensation and soft masking under the same statistical model assumptions for noise-robust speech recognition. By utilizing a Gaussian mixture model (GMM) as prior knowledge of the speech and added noise signals, the proposed method can effectively restore clean speech spectra and separate out ambient noises from a...

A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain

The quality of a speech signal degrades significantly in the presence of environmental noise, which leads to imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, single-channel enhancement of speech signals corrupted by additive noise is considered. A dictionary-based algorithm is proposed to train the speech...

Reduced complexity equalization of Lombard effect for speech recognition in noisy adverse environments

In real-world adverse environments, speech signal corruption by background noise, microphone channel variations, and speech production adjustments introduced by speakers in an effort to communicate efficiently over noise (Lombard effect) severely impact automatic speech recognition (ASR) performance. Recently, a set of unsupervised techniques reducing ASR sensitivity to these sources of distort...

Publication date: 1998